The Data Science Handbook

Author: Carl Shan, Henry Wang, William Chen & Max Song
Language: eng
Format: mobi
Publisher: The Data Science Bookshelf
Published: 2015-05-02T05:00:00+00:00


John Foreman

Chief Scientist at MailChimp

Data Science is not a Kaggle Competition.

As an undergraduate math major, John thought that he was going to be a pure mathematician. A few experiences working as a programmer, combined with a talk with his advisor, pushed him instead into the world of applied math.

After a sojourn in academia through MIT’s Operations Research PhD program, John realized that a long-term career in industry would be more interesting and fulfilling.

John held a series of jobs in business intelligence at various consulting companies before taking on the Chief Scientist role at MailChimp, a fast-growing, completely bootstrapped email startup based in Atlanta, Georgia, that boasts over 7 million users.

He is also the author of the book “Data Smart,” which presents an overview of machine learning techniques, as explained through spreadsheets.

Can you start off by talking about your book, “Data Smart”? How did the motivation for writing it come about, and what type of audience is it for?

I felt like there were a lot of business analysts and middle managers in the enterprise world who were not familiar with “data science” as a practice and set of techniques. These folks still lived in a world of “business intelligence” or “business analytics” from a decade ago, and I wanted to bring them up to speed on current methods (for example, ensemble AI models built on transactional data, data mining in graphs, forecasting with error bounds). To reach these enterprise readers, I needed to find a language and teaching approach that they’d understand.

A lot of data science books that were being introduced at the time required learning both R and techniques at the same time. With a lot of those books, rather than actually learning the techniques, you just loaded the AI package or the data mining package.

Instead, I wanted to write a book that introduced these concepts step-by-step with a tool the reader knew, and then, once they got it, slowly push them into a programming mode. So in Data Smart, I explained the gamut of data science techniques by using spreadsheets. Spreadsheets are kind of like a functional programming language and GUI in one, and they’re actually pretty good for step-by-step model building.

The last chapter is an introduction to R, and it harkens back to previous chapters now that the kernels of understanding have been planted. For example, if you’re doing an exponential smoothing forecast (which I cover in my book), you should not be doing all these steps by hand every time. You should be standing on the shoulders of the giants who’ve written the PhD theses you’re using and just open their package.

Ultimately, people who want to know each little detail of how a boosted tree model works or how modularity maximization works seem to love the book. Programmers who are used to relying on black-box libraries, functions, etc. aren’t the biggest fans.

Given your interest in opening up black boxes to examine the nitty-gritty of different techniques, did you ever want to write


